KMID : 0644020190320010061
Journal Of Korean Medical Classics
2019 Volume.32 No. 1 p.61 ~ p.74
Comparison of Word Embedding Techniques in Traditional Korean Medicine for Analysis of Data : implementation of natural language processing method
Oh Jun-Ho

Abstract
Objectives : The purpose of this study is to help select an appropriate word embedding method when analyzing East Asian traditional medicine texts as data.

Methods : Using prescription data that reflect traditional methods in East Asian medicine, we examined four count-based and two prediction-based word embedding methods. To compare these methods intuitively, we proposed a "prescription generating game" and compared its results with those obtained by applying the six methods.

Results : When adjacent vectors were extracted, the count-based word embedding methods returned the main herbs that are frequently used together. The prediction-based word embedding methods, by contrast, returned synonyms of the herbs.

Conclusions : Count-based word embedding methods seem to be more effective than prediction-based word embedding methods in analyzing the use of domesticated herbs. Among the count-based methods, the TF vector tends to exaggerate the effect of frequency, so the TF-IDF vector or the co-word vector may be a more reasonable choice. The t-score vector may be recommended when searching for unusual information that frequency alone does not reveal. Prediction-based embedding, on the other hand, seems to be effective for deriving herbs with similar meanings in context.
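
The contrast between a frequency-driven co-word vector and a t-score vector can be illustrated with a minimal sketch. Everything here is an assumption for illustration: the toy prescriptions are invented, the function names are hypothetical, and the t-score is the common collocation form t = (O - E) / sqrt(O) with E = f(a) f(b) / N, which may differ from the definition used in the paper.

```python
import math
from collections import Counter
from itertools import combinations

# Toy prescriptions (invented for illustration, not taken from the paper).
prescriptions = [
    ["ginseng", "licorice", "ginger"],
    ["ginseng", "licorice", "atractylodes"],
    ["licorice", "ginger", "cinnamon"],
    ["licorice", "cinnamon", "peony"],
]

def cooccurrence(docs):
    """Count how many prescriptions contain each herb and each herb pair."""
    herb_freq, pair_freq = Counter(), Counter()
    for doc in docs:
        herbs = sorted(set(doc))
        herb_freq.update(herbs)
        pair_freq.update(combinations(herbs, 2))  # pairs are alphabetically ordered
    return herb_freq, pair_freq

herb_freq, pair_freq = cooccurrence(prescriptions)
n_docs = len(prescriptions)

def co_count(a, b):
    """Co-word weight: number of prescriptions containing both herbs."""
    return pair_freq[tuple(sorted((a, b)))]

def t_score(a, b):
    """Assumed collocation t-score: (observed - expected) / sqrt(observed)."""
    observed = co_count(a, b)
    if observed == 0:
        return 0.0
    expected = herb_freq[a] * herb_freq[b] / n_docs
    return (observed - expected) / math.sqrt(observed)

def top_partner(herb, score):
    """Return the other herb with the highest score against `herb`."""
    others = [h for h in herb_freq if h != herb]
    return max(others, key=lambda h: score(herb, h))

print(top_partner("ginseng", co_count))  # ranking by raw co-occurrence
print(top_partner("ginseng", t_score))   # ranking by t-score
```

In this toy data, licorice appears in every prescription, so raw co-occurrence ranks it first for ginseng, while the t-score discounts that expected overlap and ranks the rarer atractylodes first.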
Keywords
Word Embedding, East Asian Traditional Medicine, Korean Medicine, Data Analysis, Natural Language Processing
Listed journal information
Korea Research Foundation (KCI)